Distributed and hierarchical neural encoding of multidimensional biological motion attributes in the human brain

Abstract The human visual system can efficiently extract distinct physical, biological, and social attributes (e.g. facing direction, gender, and emotional state) from biological motion (BM), but how these attributes are encoded in the brain remains largely unknown. In the current study, we used functional magnetic resonance imaging to investigate this issue when participants viewed multidimensional BM stimuli. Using multiple regression representational similarity analysis, we identified distributed brain areas, respectively, related to the processing of facing direction, gender, and emotional state conveyed by BM. These brain areas are governed by a hierarchical structure in which the respective neural encoding of facing direction, gender, and emotional state is modulated by each other in descending order. We further revealed that a portion of the brain areas identified in representational similarity analysis was specific to the neural encoding of each attribute and correlated with the corresponding behavioral results. These findings unravel the brain networks for encoding BM attributes in consideration of their interactions, and highlight that the processing of multidimensional BM attributes is recurrently interactive.


Introduction
Precisely perceiving and understanding multidimensional information from other people's movements plays a vital role in our survival and social interaction (Yovel and O'Toole 2016). Bodily movements contain plentiful and hierarchical attributes, including physical attributes like direction and speed, biological attributes like gender and age, and social attributes like affect and intention. Since the point-light biological motion (BM), which isolates human body movements, was introduced by Johansson (1973), a large body of research has revealed that the human visual system can efficiently extract and process distinct physical, biological, and social attributes from BM, such as facing direction (Saunders et al. 2010; Wang et al. 2020), gender (Kozlowski and Cutting 1977; van der Zwan et al. 2009), and emotional state (Atkinson et al. 2004; Clarke et al. 2005; Parkinson et al. 2017). Although the ability to recognize and evaluate different BM attributes has been well demonstrated at the behavioral level, it remains largely unknown how the brain simultaneously encodes distinct attributes from BM.
Previous studies have identified a large brain network dedicated to BM processing, involving the posterior superior temporal sulcus (pSTS), the middle temporal visual complex (MT), the fusiform gyrus (FG), and portions of the frontal and parietal cortex (Bonda et al. 1996; Grossman and Blake 2002; Saygin 2004, 2007; Peuskens et al. 2005; Peelen et al. 2006; Jastorff and Orban 2009; Grosbras et al. 2012; Thompson and Parasuraman 2012; van Kemenade et al. 2012; Yovel and O'Toole 2016; Hirai and Senju 2020; Pitcher and Ungerleider 2021). Within this distributed network, certain brain areas are found to be crucial for processing particular attributes of BM. Facing direction perception is supported by FG (Michels et al. 2009), primary somatosensory cortex, and inferior frontal gyrus (IFG; De Lussanet et al. 2008). The emotional state perception network mainly includes pSTS, FG, temporoparietal junction, and superior frontal gyrus (SFG; Alaerts et al. 2014; Atkinson et al. 2012; Goldberg et al. 2015; Jastorff et al. 2015, 2016; Poyo Solanas et al. 2020). On the other hand, the perceptual and neural representations of distinct BM attributes may not be independent of each other, considering the multidimensional nature of BM. Indeed, a few behavioral studies have shown that the perception of one BM attribute can be significantly modulated by the processing of the other attribute(s). For instance, emotional states conveyed by BM significantly affect its gender categorization such that angry BMs are overwhelmingly judged to be men while sad BMs are judged to be women (Johnson et al. 2011). Gender information also affects the direction discrimination of BM, with direction discrimination being worse when the BM depicts a male than when it depicts a female (Yang et al. 2014).
Based on these observations, it is reasonable to expect that there exists a hierarchical neural mechanism for processing multidimensional BM attributes in the brain. In the present study, we investigated how the brain encodes multidimensional BM attributes and in what order these processes are organized. In particular, we asked whether different attributes are extracted and represented in an independent or feedforward manner (e.g. from physical to biological and social attributes), or whether the processing of multidimensional attributes is accomplished through recurrently interactive means.
To investigate this issue, we used functional magnetic resonance imaging (fMRI) to measure the hemodynamic response in the brain when participants viewed the multidimensional BM stimuli. Point-light walkers were employed as stimuli that preserve facing direction, gender, and emotional state attributes. We used representational similarity analysis (RSA; Kriegeskorte et al. 2008) because it has the unique advantage of isolating target attributes from the multidimensional BM stimuli in the brain. We first performed a multiple regression RSA to identify the brain network responding to each BM attribute. Then we introduced scrambled representational dissimilarity matrices (RDMs) in RSA to further investigate how the neural representations of BM attributes were influenced by each other. We also used multivoxel pattern analysis (MVPA) methods (Edelman et al. 1998; Kriegeskorte et al. 2008) to explore whether the brain areas identified by RSA were specific to the neural encoding of BM attributes. Lastly, we analyzed the correlations between behavioral results and neural responses by performing an RSA with behavioral RDMs. Our study unraveled a distributed and hierarchical neural network dedicated to the neural encoding of multidimensional BM attributes.

Participants
In total, 24 neurologically normal volunteers (12 female, aged 21-30 years) with normal or corrected-to-normal vision participated in the study. Two participants were excluded due to excessive head motion (>2 mm of maximal translation in the x, y, or z direction or >2° of angular motion throughout the scan), and two additional participants were excluded due to their responses in the behavioral experiment (see below for details). Participants gave written informed consent prior to participation in the study. The experimental procedures were approved by the institutional review board of the Institute of Psychology, Chinese Academy of Sciences.

Stimuli
Stimuli were point-light BM sequences based on motion capture data computed from 50 men and 50 women (https://www.biomotionlab.ca/html5-bml-walker/; Troje 2002). Each stimulus was represented by a set of 15 dots and subtended a visual angle of 6.9° × 3.9°. We systematically manipulated three attributes of the stimuli, namely facing direction (left or right), gender (female or male), and emotional state (happy or sad) (Fig. 1A), leading to eight BM sequences. For facing direction, the point-light walkers were presented in a 45° or 135° view. For gender and emotional state, corresponding parameters of the point-light walkers were selected in ±6 standard deviation increments from an average walker (stimulus 0; for details of the metric and associated assumptions, see Troje 2002, 2008). The stimuli were presented as white dots against a black background, with each walking cycle lasting 1 s. Stimulus presentation and response collection were controlled by Matlab (MathWorks, Inc.) together with the Psychtoolbox extensions (Brainard 1997).

Behavioral experiment
We conducted a behavioral experiment prior to the fMRI experiment to test the participants' ability to discriminate each attribute of BM. On each trial, one of the BM sequences was presented at the upper center of the screen, and a question with options was presented at the lower center of the screen. We used the semantic differential scale, in which an adjective is paired with its antonym and the two adjectives are assigned numbers on a scale from 1 to 7 (Osgood 1964). Participants were asked to estimate where an attribute of the stimulus was placed on the scale. For example, in one trial, the question was "what is the gender of the person", and the option "pretty confident it is a woman" was assigned 1, and "pretty confident it is a man" was assigned 7. Each of the eight BM sequences was rated on its facing direction, gender, and emotional state four times, leading to a total of 96 trials. Behavioral results are presented in Supplementary Material Table S1. Participants who met either of the following two exclusion criteria were excluded and did not participate in the fMRI experiment: (i) the behavioral results exceeded three-sigma limits in at least one attribute; (ii) the participant reported "4-not sure" in >50% of trials for at least one attribute. According to these criteria, two participants were excluded. One of them made completely opposite judgments on the facing direction attribute, and the other could not discriminate emotional state in more than half of the trials.

fMRI experiment
In the fMRI scanner, stimuli were back-projected onto a screen (60 Hz frame rate, 1,024 × 768 pixels screen resolution) via a liquid crystal projector and viewed through a mirror mounted on the head coil. fMRI runs were arranged in a block design with each run containing 24 blocks. Each block consisted of a BM sequence lasting 12 s. Every 2 s, the BM stimulus was spatially jittered by 1° to reduce the potential adaptation effect (Dubois et al. 2015). Eight BM sequences were repeated three times in random order and were interleaved with 6 s fixation blocks. Participants completed six runs, and each run lasted 7 min and 18 s. To ensure that the participant's attention was focused on the stimulus during scanning, the BM sequence was either forward-walking or backward-walking and randomly changed 1-2 times in each block (Fig. 1B). The participants were required to detect how many times the walking direction of the BM sequence had changed and answered by pressing one of two keys on a keyboard after the stimulus disappeared. All participants demonstrated good performance, with a mean accuracy of 0.883 ± 0.013 and an average reaction time of 1.004 ± 0.054 s. An additional group of participants evaluated the facing direction, gender, and emotional state attributes when the BM sequence was forward-walking or backward-walking, which verified that walking direction did not significantly influence the recognition of these attributes (see Supplementary Material Table S2 for details).

fMRI acquisition
Functional and anatomical data were collected using a 3-Tesla Siemens Prisma scanner at the Beijing Magnetic Resonance Imaging Center for Brain Research. Functional data were collected using a T2*-weighted echo planar imaging sequence with the following parameters: 78 axial slices (with multiband), repetition time (TR) = 2,000 ms, echo time (TE) = 30 ms, flip angle = 70°, field of view (FOV) = 192 × 192 mm², matrix size = 96 × 96, thickness/gap = 2/0 mm. Each fMRI session consists of 219 functional volumes. A 3D T1-weighted magnetization-prepared rapid gradient

Preprocessing and general linear model analysis
The fMRI data were preprocessed using statistical parametric mapping 12 (SPM12, http://www.fil.ion.ucl.ac.uk/spm/, Wellcome Center for Human Neuroimaging, London, United Kingdom). The first three volumes were discarded to avoid T1 saturation. Then the functional images were corrected for slice acquisition time and head motion, and co-registered with the T1 anatomical image. Next, anatomical images were spatially normalized to the Montreal Neurological Institute template, and normalization parameters were applied to the functional images.
After preprocessing, we applied traditional general linear model (GLM)-based analysis. For each participant and each run, beta weights of the experimental conditions were estimated using design matrices containing predictors of the eight stimuli and six head motion parameters in a gray matter mask. An absolute threshold masking value of 0.2 was applied to avoid possible edge effects between different tissue types (Ashburner 2007). The resulting SPM T images for the eight stimuli were used for RSA (Salmela et al. 2018). The beta weights for 8 stimuli × 6 runs were used for the MVPA (Turner et al. 2012).

Multiple regression RSA
To investigate which brain areas responded to the three attributes of BM, we employed a multiple regression RSA with a searchlight approach using the CoSMoMVPA toolbox (Oosterhof et al. 2016). First, we created theoretical RDMs for each attribute of the stimuli (Fig. 1C). Each RDM was an 8 × 8 binary matrix in which 1 corresponded to a between-category stimulus comparison (e.g. male vs. female for gender discrimination) and 0 corresponded to a within-category stimulus comparison (e.g. female vs. female for gender discrimination). This procedure resulted in three theoretical RDMs corresponding to the facing direction, gender, and emotional state attributes of the stimuli. To exclude the influence of low-level visual properties such as retinotopic shape biases across our stimulus categories, we added another predictor based on a model of V1 cortical neurons to the multiple regression RSA. We used the C1 units of the hierarchical max-pooling (HMAX) model (Riesenhuber and Poggio 1999) to simulate the V1 cortical response to every BM sequence using software provided by the Center for Biological & Computational Learning at Massachusetts Institute of Technology (http://cbcl.mit.edu/software-datasets/). Because processing by the units is approximated as essentially instantaneous in the HMAX model (Serre et al. 2007), we first modeled each frame of the BM sequence and then averaged the responses across frames. Then we derived the V1 RDM using 1-Pearson correlation as a measure of dissimilarity (Fig. 1D). Additionally, we analyzed the correlations between the three theoretical RDMs and the V1 RDM (Fig. 1). The correlation between the V1 RDM and the theoretical RDMs was 0.21 for facing direction, 0.68 for gender, and 0.27 for emotional state.
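As an illustration of how such theoretical RDMs can be constructed, the following is a minimal Python sketch, not the CoSMoMVPA code used in the study. The 0/1 coding of the attribute levels, the stimulus ordering, and all function and variable names are assumptions made for the example.

```python
import itertools
import numpy as np

# The eight stimuli as all combinations of three binary attributes.
# The 0/1 coding is arbitrary (assumed here: 0 = left, female, happy).
stimuli = np.array(list(itertools.product([0, 1], repeat=3)))  # shape (8, 3)
attributes = ["facing_direction", "gender", "emotional_state"]


def theoretical_rdm(labels):
    """8 x 8 binary RDM: 1 for between-category pairs, 0 for within-category pairs."""
    labels = np.asarray(labels)
    return (labels[:, None] != labels[None, :]).astype(float)


def lower_triangle(rdm):
    """Vectorize the off-diagonal lower triangle (28 unique pairs for 8 stimuli)."""
    rows, cols = np.tril_indices(rdm.shape[0], k=-1)
    return rdm[rows, cols]


rdms = {name: theoretical_rdm(stimuli[:, i]) for i, name in enumerate(attributes)}
vectors = np.array([lower_triangle(rdms[name]) for name in attributes])

# Pairwise Pearson correlations between the three vectorized model RDMs.
model_correlations = np.corrcoef(vectors)
print(np.round(model_correlations, 3))
```

In a fully crossed 2 × 2 × 2 design, each vectorized RDM contains 16 between-category and 12 within-category pairs, and each pair of model RDMs correlates only weakly (r = -1/6), which is what allows them to serve as separate predictors in a regression.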
Next, we calculated the neural RDM. In each searchlight, we created a neural representational similarity matrix (RSM) by calculating the Pearson correlation between the SPM T maps for every item pair and then converted the neural RSM into a neural RDM using 1-Pearson correlation. After that, we performed the multiple regression RSA using the three theoretical RDMs and the V1 RDM as the independent variables and the neural RDM as the dependent variable. This method can identify the brain maps for the processing of one attribute while excluding the influence of the other attributes and low-level visual properties. We conducted the multiple regression RSA for each participant within the gray matter mask using a searchlight approach, with each searchlight consisting of 200 voxels (Anderson et al. 2015; Thornton and Mitchell 2017; Liuzzi et al. 2020). The resulting correlation values for each predictor were assigned to the central node of each searchlight, leading to a correlation map for each attribute, separately for each participant. Additionally, we conducted an RSA with only the V1 RDM as a predictor to verify that this model represents the primary visual cortex activity for the BM sequence (see Supplementary Material Fig. S1 for details). We also conducted an RSA with the three theoretical RDMs but without the V1 RDM as predictors. Results showed similar cortical maps for each attribute except in the visual network (Supplementary Material Fig. S2). All RSA results were Fisher transformed and then entered into one-sample t-tests. Statistical maps were corrected for multiple comparisons using a cluster-based Monte Carlo simulation algorithm as implemented in the CoSMoMVPA toolbox. We used an initial voxel-wise threshold of P = 0.001 and 5,000 iterations of Monte Carlo simulations (Forman et al. 1995; Goebel et al. 2006). For visualization, maps were projected onto a cortical surface using the BrainNet toolbox (Xia et al. 2013).
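The regression step within a single searchlight can be sketched as follows. This is a simplified Python illustration under stated assumptions rather than the toolbox implementation: the voxel patterns are simulated, the V1 predictor is omitted (it could be passed as a fourth predictor), predictors and data are z-scored so that the coefficients are standardized, and all names are hypothetical.

```python
import itertools
import numpy as np

TRI = np.tril_indices(8, k=-1)  # indices of the 28 unique stimulus pairs


def model_rdm(labels):
    """Binary RDM: 1 between categories, 0 within a category."""
    return (labels[:, None] != labels[None, :]).astype(float)


def neural_rdm(patterns):
    """patterns: (n_stimuli, n_voxels); dissimilarity is 1 - Pearson r between rows."""
    return 1.0 - np.corrcoef(patterns)


def regression_rsa(neural, predictors):
    """Regress the vectorized neural RDM on the vectorized predictor RDMs.

    Returns one standardized coefficient per predictor (intercept dropped).
    """
    y = neural[TRI]
    X = np.column_stack([p[TRI] for p in predictors])
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    y = (y - y.mean()) / y.std()
    design = np.column_stack([np.ones(y.size), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:]


# Simulated searchlight of 200 voxels whose pattern depends on gender only.
rng = np.random.default_rng(0)
stimuli = np.array(list(itertools.product([0, 1], repeat=3)))
models = [model_rdm(stimuli[:, i]) for i in range(3)]  # direction, gender, emotion
base = rng.normal(size=200)
gender_signal = rng.normal(size=200)
patterns = (base
            + np.outer(2 * stimuli[:, 1] - 1, gender_signal)
            + 0.5 * rng.normal(size=(8, 200)))

betas = regression_rsa(neural_rdm(patterns), models)
print(dict(zip(["facing_direction", "gender", "emotional_state"], np.round(betas, 2))))
```

In this simulation the gender predictor receives by far the largest coefficient, because the simulated voxel patterns differ only along the gender dimension. In a searchlight analysis the same regression would be repeated at every searchlight center and the coefficients written back to the brain map.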

Scrambled RDM-based RSA and dice coefficient analysis
We further investigated how the neural encoding of an attribute is influenced by that of the other two attributes. To do this, we carried out a new set of multiple regression RSAs, one for each of the three attributes. In these RSAs, the target theoretical RDM remained unchanged, and the other two theoretical RDMs were shuffled randomly (Carlson et al. 2013; Liang et al. 2013; Bayet et al. 2020). For example, for the gender attribute, we scrambled the facing direction and emotional state RDMs but retained the theoretical RDM of gender, then recalculated the multiple regression RSA with the searchlight approach. This process was repeated 1,000 times for each attribute and each participant, and the resulting correlation map was transformed into a z-score. Then all the z maps were averaged for each attribute and each participant. After that, the results were entered into one-sample t-tests for each attribute and corrected by 5,000 iterations of Monte Carlo simulations, resulting in the group-level maps. Note that each of the above brain maps reflected the processing of an attribute without excluding the implied influence of the processing of the other attributes.
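The scrambling procedure can be illustrated for a single searchlight with the sketch below. It rests on several assumptions that the text does not specify: scrambling is implemented by permuting stimulus labels (rows and columns together), the z-transformation step is omitted, the neural RDM is simulated, and all names are hypothetical.

```python
import itertools
import numpy as np

TRI = np.tril_indices(8, k=-1)  # 28 unique stimulus pairs


def model_rdm(labels):
    return (labels[:, None] != labels[None, :]).astype(float)


def scramble(rdm, rng):
    """Permute stimulus labels (rows and columns together), keeping the RDM symmetric."""
    order = rng.permutation(rdm.shape[0])
    return rdm[np.ix_(order, order)]


def regression_betas(neural, predictors):
    """Least-squares coefficients of the predictors (intercept dropped).

    lstsq returns the minimum-norm solution if a scrambled RDM happens to
    coincide with another predictor and the design becomes rank deficient.
    """
    X = np.column_stack([np.ones(TRI[0].size)] + [p[TRI] for p in predictors])
    coef, *_ = np.linalg.lstsq(X, neural[TRI], rcond=None)
    return coef[1:]


def scrambled_rsa(neural, models, target, n_iter, rng):
    """Mean coefficient of the target model when the non-target models are scrambled."""
    out = np.empty(n_iter)
    for i in range(n_iter):
        predictors = [m if j == target else scramble(m, rng)
                      for j, m in enumerate(models)]
        out[i] = regression_betas(neural, predictors)[target]
    return out.mean()


rng = np.random.default_rng(0)
stimuli = np.array(list(itertools.product([0, 1], repeat=3)))
models = [model_rdm(stimuli[:, i]) for i in range(3)]  # direction, gender, emotion

# Simulated neural RDM that follows the gender model plus symmetric noise.
noise = rng.normal(scale=0.05, size=(8, 8))
neural = models[1] + (noise + noise.T) / 2

mean_beta = scrambled_rsa(neural, models, target=1, n_iter=200, rng=rng)
print(round(float(mean_beta), 3))
```

Comparing the target coefficient obtained here with the one from the unscrambled regression gives a per-searchlight view of how much the non-target predictors contribute to the estimate for the target attribute.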
Then we used the Dice coefficient (DC; Dice 1945) to assess the degree of overlap between the group-level statistical maps of the multiple regression RSA and the RSA with scrambled RDMs for each attribute (Gorgolewski et al. 2013; Sair et al. 2016). The DC is defined as $\mathrm{DC} = 2N_C/(N_1 + N_2)$, where $N_C$ is the number of voxels the two statistical maps share in common, $N_1$ is the number of voxels in the first map, and $N_2$ is the number of voxels in the second map. DC ranges from 0 to 1, where 1 indicates complete congruence between the number and location of voxels in both thresholded maps, while 0 indicates no congruence (Dice 1945; Sørensen 1948). We used this analysis to evaluate the extent to which the neural encoding of one attribute is modulated by that of the other two. Specifically, if the processing of an attribute is entirely uninfluenced by the others, the brain maps of the multiple regression RSA and the RSA with scrambled RDMs would overlap perfectly, leading to a DC of 1. Conversely, if the processing of an attribute is completely dependent on the processing of the others, the two brain maps would share no areas, leading to a DC of 0. Between 0 and 1, the smaller the influence of the other attributes, the larger the DC value.
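The DC computation can be written in a few lines. The Python sketch below assumes the two thresholded group maps are available as boolean arrays of equal shape; the function name and the toy maps are illustrative only.

```python
import numpy as np


def dice_coefficient(map_a, map_b):
    """Dice coefficient 2*N_C / (N_1 + N_2) between two boolean voxel maps."""
    a = np.asarray(map_a, dtype=bool)
    b = np.asarray(map_b, dtype=bool)
    n_common = np.logical_and(a, b).sum()  # N_C: voxels significant in both maps
    total = a.sum() + b.sum()              # N_1 + N_2
    if total == 0:
        return float("nan")                # undefined when both maps are empty
    return 2.0 * n_common / total


# Toy example: three significant voxels per map, two of them shared.
original_map = np.array([1, 1, 1, 0, 0, 0], dtype=bool)
scrambled_map = np.array([0, 1, 1, 1, 0, 0], dtype=bool)
dc = dice_coefficient(original_map, scrambled_map)
print(round(dc, 3))  # 2 * 2 / (3 + 3) = 0.667
```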

Multivoxel pattern analysis
We used MVPA to measure whether the significant clusters identified by multiple regression RSA were specific to the neural encoding of the corresponding BM attributes. We first extracted the beta maps of 8 stimuli × 6 runs from the GLM analysis, resulting in a total of 48 beta maps for each participant. These beta maps were split into two parts corresponding to the categories for facing direction (left vs. right), gender (male vs. female), and emotional state (happy vs. sad), respectively. For example, when classifying gender, the 48 beta maps were split into 24 female beta maps and 24 male beta maps. Then we conducted the MVPA for each attribute and each participant within the gray matter mask using a searchlight approach, with each searchlight consisting of 200 voxels. In each searchlight, we demeaned the data by subtracting the mean beta value from each beta value of the individual voxel to reduce the amplitude effects of different conditions. We used a leave-one-run-out cross-validation method, so that for each iteration, we trained a linear support vector machine (Chang and Lin 2011) classifier using data from five fMRI runs and tested the classifier with the data from the one remaining run. After that, a whole-brain map for each participant was defined in which the center voxel of each searchlight was labeled according to classification accuracy, and then the classification accuracy of each cluster identified by multiple regression RSA was defined as the mean classification accuracy of all voxels located in this cluster (the whole-brain searchlight MVPA results are presented in Supplementary Material Fig. S3). The clusters' accuracies were entered into a one-sample t-test against chance (50%; Fuelscher et al. 2019; Lee et al. 2012; Sapountzis et al. 2010). We also used paired t-tests to compare the clusters' accuracies between pairs of attributes and corrected the results by FDR correction (P < 0.05). Results were projected onto a cortical surface using the BrainNet toolbox on three levels: (i) the classification accuracy of the corresponding attribute is significantly above chance level, (ii) the classification accuracy of one attribute is significantly higher than one of the other attributes, (iii) the classification accuracy of one attribute is significantly higher than both of the other two attributes.
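The leave-one-run-out scheme can be sketched for one searchlight as follows. To keep the example dependency-free, a nearest-centroid classifier is substituted for the linear support vector machine used in the study, so the sketch illustrates the cross-validation logic rather than the actual classifier. The demeaning step is interpreted here as subtracting each pattern's mean across voxels, which is an assumption, and the data are simulated.

```python
import itertools
import numpy as np


def leave_one_run_out_accuracy(patterns, labels, runs):
    """Cross-validated accuracy: train on all runs but one, test on the held-out run.

    patterns: (n_samples, n_voxels) beta patterns; labels and runs: (n_samples,).
    A nearest-centroid classifier stands in for the linear SVM.
    """
    # Remove each pattern's mean across voxels (assumed form of demeaning).
    patterns = patterns - patterns.mean(axis=1, keepdims=True)
    correct = 0
    for test_run in np.unique(runs):
        train = runs != test_run
        test = ~train
        classes = np.unique(labels[train])
        centroids = np.array([patterns[train & (labels == c)].mean(axis=0)
                              for c in classes])
        # Squared Euclidean distance of every test pattern to every class centroid.
        dist = ((patterns[test][:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        predicted = classes[dist.argmin(axis=1)]
        correct += int((predicted == labels[test]).sum())
    return correct / labels.size


# Simulated data: 8 stimuli x 6 runs = 48 patterns in a 200-voxel searchlight.
rng = np.random.default_rng(0)
stimuli = np.array(list(itertools.product([0, 1], repeat=3)))
gender = np.tile(stimuli[:, 1], 6)   # 24 patterns per gender category
runs = np.repeat(np.arange(6), 8)
signal = rng.normal(size=200)
patterns = 0.5 * np.outer(2 * gender - 1, signal) + rng.normal(size=(48, 200))

accuracy = leave_one_run_out_accuracy(patterns, gender, runs)
print(accuracy)
```

Because training and test patterns always come from different runs, the accuracy estimate is not inflated by within-run noise correlations, which is the main reason for splitting by run rather than by trial.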

RSA between neural and behavioral RDMs
We investigated whether the fMRI results were related to behavioral judgments. To this end, we used the responses obtained in the behavioral experiment to define the behavioral RDM for each participant. We derived the behavioral RDM of each attribute by using the Euclidean distance between the judgments for each pair of stimuli as the distance metric. Then we conducted an RSA with behavioral RDMs as predictors in each cluster identified in the multiple regression RSA, separately for each participant. The results were then Fisher transformed, entered into a one-sample t-test, and corrected by FDR correction (P < 0.05) across clusters.
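A behavioral RDM of this kind, and its comparison with a neural RDM, might be computed as in the following sketch. It assumes that the four ratings per stimulus are averaged before distances are taken (in which case the Euclidean distance reduces to an absolute difference) and that the comparison uses a Pearson correlation; neither detail is stated in the text, and the ratings and neural RDM are simulated.

```python
import itertools
import numpy as np

TRI = np.tril_indices(8, k=-1)  # 28 unique stimulus pairs


def behavioral_rdm(ratings):
    """ratings: (n_stimuli, n_repeats) judgments on the 1-7 scale for one attribute.

    Returns the Euclidean distance between mean ratings of every stimulus pair.
    """
    mean_rating = np.asarray(ratings, dtype=float).mean(axis=1)
    return np.abs(mean_rating[:, None] - mean_rating[None, :])


def fisher_z_similarity(rdm_a, rdm_b):
    """Pearson correlation of the vectorized RDMs, then Fisher z-transformed."""
    r = np.corrcoef(rdm_a[TRI], rdm_b[TRI])[0, 1]
    return r, np.arctanh(r)


rng = np.random.default_rng(0)
stimuli = np.array(list(itertools.product([0, 1], repeat=3)))
gender = stimuli[:, 1]

# Simulated gender ratings: around 2 for one category and around 6 for the other.
ratings = np.where(gender[:, None] == 1, 6, 2) + rng.integers(-1, 2, size=(8, 4))

# Simulated neural RDM of one cluster that follows the gender structure.
noise = rng.normal(scale=0.1, size=(8, 8))
neural = (gender[:, None] != gender[None, :]).astype(float) + (noise + noise.T) / 2

r, z = fisher_z_similarity(behavioral_rdm(ratings), neural)
print(round(float(r), 3), round(float(z), 3))
```

The Fisher-transformed values from all participants would then be tested against zero for each cluster, as described above.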

Brain networks for encoding different BM attributes
The multiple regression RSA revealed the brain regions encoding facing direction, gender, and emotional state (Fig. 2; corrected by a cluster-based Monte Carlo simulation with an initial voxel-wise threshold of P = 0.001 and 5,000 iterations). Specifically, the brain regions encoding the facing direction information involved the bilateral lingual gyri (Brodmann area [BA] 17), left middle occipital gyrus (MOG, BA 37), right superior occipital gyrus (SOG, BA 18), bilateral FG (BA 18), bilateral middle temporal gyri (MTG, BA 19), left inferior parietal lobule (IPL, BA 40), bilateral superior parietal lobules (SPL, BA 7), bilateral postcentral gyri (BA 3), bilateral precentral gyri (BA 4), left supplementary motor area (SMA, BA 6), bilateral insula (BA 13), bilateral IFG (BA 47), and right anterior cingulate cortex (ACC, BA 10; Fig. 2A). The brain regions encoding the gender information involved the right lingual gyrus (BA 17),  2C). These results together demonstrated that the respective neural encoding of the facing direction, gender, and emotional state information embedded in BM stimuli involved considerably overlapping brain regions, which raised the possibility that the neural representations of the three attributes of BM might be shared and possibly interact with each other.

Hierarchical neural encoding among the three BM attributes
We further investigated how the neural encoding of an attribute is influenced by that of the other attributes. To do this, for each attribute, we conducted a new searchlight multiple regression RSA in which the theoretical RDM of the target attribute remained unchanged, and the other two theoretical RDMs were scrambled. This procedure was repeated 1,000 times and then the results were averaged. If the neural encoding of the target attribute is influenced by the processing of the other two attributes, then the beta value for the target attribute would differ between the RSA results before and after the scrambling manipulation, resulting in a change in the statistical maps identified by the searchlight approach. DCs evaluated the extent to which the neural encoding of one attribute is modulated by that of the other two. Figure 2 shows the DC for the pairs of analyses before and after the scrambling manipulation on group-level thresholded statistical maps. The DC is 0.394 for facing direction (Fig. 2D), 0.442 for gender (Fig. 2E), and 0.751 for emotional state (Fig. 2F). These results demonstrated that the neural encodings of all attributes were influenced by each other to some extent, that is, the neural encodings of the three attributes were recurrently interactive. Among the three attributes, the changes in neural encoding before and after the scrambling manipulation, which are inversely related to the DC values, were in descending order for facing direction, gender, and emotional state.

Brain areas specific to the classifications of BM attributes
To further explore whether the brain areas in the BM attribute encoding networks were specific to the corresponding attribute classifications, we performed MVPA on the clusters revealed in RSA to measure the classification accuracies for discriminating the three attributes. In most clusters, the classification accuracies for the corresponding attribute were significantly above chance level (50%). A portion of the brain areas showed significantly higher classification accuracies for the corresponding attribute than for the other attributes. We displayed the statistical results on brain maps (Fig. 3). In the facing direction encoding network, classification accuracies were significantly above chance level in the bilateral lingual gyri (BA 17), left MOG (BA 37), right SOG (BA 18), bilateral MTG (BA 19), bilateral FG (BA 18), bilateral SPL (BA 7), bilateral postcentral gyri (BA 3), left precentral gyrus (BA 4), left SMA (BA 6), bilateral insula (BA 13), right IFG (BA 47), and right ACC (BA 10; P < 0.05, FDR correction). Among these regions, the classification accuracies of facing direction were significantly higher than one of the other two attributes in the left MOG (BA 37), left SPL (BA 7), bilateral postcentral gyri (BA 3), left precentral gyrus (BA 4), left SMA (BA 6), and left insula (BA 13), and significantly higher than both of the other two attributes in the left MTG (BA 19), right SOG (BA 18), bilateral lingual gyri (BA 17), and bilateral FG (BA 18). In the gender encoding network, classification accuracies were significantly above chance level in the right lingual gyrus (BA 17), left FG (BA 37), right ITG (BA 37), left MTG (BA 19), right pSTS (BA 22), right postcentral gyrus (BA 2), right SMA (BA 6), bilateral insula (BA 13), right IFG (BA 47), and right MFG (BA 6; P < 0.05, FDR correction). Among these regions, the classification accuracies of gender were significantly higher than one of the other two attributes in the right lingual gyrus (BA 17), left FG (BA 37), left MTG (BA 19), right postcentral gyrus (BA 2), left insula (BA 13), and right IFG (BA 47), but no cluster had significantly higher classification accuracies than both of the other two attributes. In the emotional state encoding network, classification accuracies were significantly above chance level in the bilateral lingual gyri (BA 17, 18), bilateral IOG (BA 18), bilateral MOG (BA 19), bilateral SOG (BA 7, 18, 19), bilateral FG (BA 19), bilateral ITG (BA 37), bilateral MTG (BA 39), bilateral IPL (BA 40), bilateral SPL (BA 7), bilateral postcentral gyri (BA 3), bilateral precentral gyri (BA 6), left SMA (BA 6), right SFG (BA 11), right MFG (BA 6), and right IFG (BA 46; P < 0.05, FDR correction). Among these regions, the classification accuracies of emotional state were significantly higher than one of the other two attributes in the left postcentral gyrus (BA 3), left SMA (BA 6), and right FG (BA 19), and significantly higher than both of the other two attributes in the bilateral lingual gyri (BA 17, 18), bilateral IOG (BA 18), bilateral MOG (BA 19), bilateral SOG (BA 7, 18, 19), left FG (BA 19), left ITG (BA 37), bilateral MTG (BA 39), and bilateral SPL (BA 7).
Fig. 3. MVPA results in the significant clusters defined in RSA of facing direction (A), gender (B), and emotional state (C). The light colors mean the classification accuracy was significantly higher than chance level (50%). The medium colors mean the classification accuracy was significantly higher than one of the other attributes. The dark colors mean the classification accuracy was significantly higher than both of the other two attributes. Results were corrected by FDR correction (P < 0.05).

Correlation between neural representations and behavioral results
Not all information that can be read out from brain activity is directly used by the brain to guide behaviors. To investigate whether the brain regions revealed by RSA were correlated with participants' behavioral discrimination of each attribute, we calculated the correlation between behavioral RDMs and neural RDMs in the clusters identified in the multiple regression RSA. Figure 4 shows the correlation on each cluster (P < 0.05, FDR correction). Most of the clusters showed significant correlations between behavioral and neural responses.

Discussion
The present study investigated the brain encoding of multidimensional BM information. Using multiple regression RSA, we identified distributed brain networks related to facing direction, gender, and emotional state processing. These brain areas were governed by a recurrently interactive mechanism in which the processing of each attribute was influenced by the others, leading to a hierarchical structure: the respective neural encoding of facing direction, gender, and emotional state is modulated by each other in descending order. Within the three attribute encoding networks, a portion of the brain areas was specific to the classification of the corresponding attribute. Most of these areas showed significant correlations between behavioral and neural responses. Taken together, these results revealed the distributed and hierarchical brain mechanisms of multidimensional BM processing and provided constraints on computational models of BM perception.

Recurrently interactive multidimensional BM attribute encoding
While a few prior studies have investigated the brain substrates underlying particular BM attribute perception or its cognitive effect (De Lussanet et al. 2008; Michels et al. 2009; Atkinson et al. 2012; Alaerts et al. 2014; Goldberg et al. 2015; Jastorff et al. 2015; Poyo Solanas et al. 2020), they have focused on the processing of a single attribute of BM, and hence could not address in what order different BM attributes are encoded in the brain and how they interact with each other. The current study extended previous findings and, for the first time, investigated the brain network related to the neural encoding of BM attributes from a multidimensional perspective. Although different BM attributes convey different levels of information, the current study demonstrated that the neural encodings of different BM attributes are recurrently interactive. It has long been assumed that low-level and simple features are extracted early in the visual system while high-level and complex attributes are processed at a relatively late stage of visual processing (Rousselet et al. 2004). Among BM attributes, facing direction conveys relatively "low-level" orientation information, which can be similarly transmitted by many inanimate objects such as moving cars or moving balls. Gender conveys biological or physiological information that is specific to living beings. The emotional state represents relatively "high-level" mental state information. Recently, researchers have proposed a dual-process theory for BM perception, which suggests that detecting facing direction is a rapid preattentive process while evaluating gender and emotional state requires more cognitive processes (Hirai and Senju 2020). However, our DC analysis demonstrates that the processes of BM attributes are recurrently influenced by each other rather than operating in a simply independent or feedforward manner.
The DC analysis also reveals a hierarchical structure among the three attributes. The DC for facing direction is the smallest, suggesting that its neural encoding is the most influenced by the processing of the other attributes. The DC for emotional state is the largest, suggesting that its neural encoding is the least influenced by the processing of the other attributes. This pattern of results is in accordance with previous behavioral findings on BM perception. Although it has long been thought that the gender information of both the agent and the observer could affect the perception of emotional state (Cutting and Kozlowski 1977; Johnson et al. 2011; Kret et al. 2011), researchers also found that the emotional state of BM could affect the gender judgment (Johnson et al. 2011). Furthermore, the gender of the BM stimulus has been shown to modulate the perception of its facing direction (Yang et al. 2014).

Brain network as a function of BM attribute extraction
This study revealed extensive networks involved in extracting facing direction, gender, and emotional state from BM. Within these networks identified by RSA, MVPA further confirmed that the classification accuracies for the corresponding attributes were significantly above chance level in a large number of areas. More importantly, in some areas, the classification accuracies for the corresponding attribute were significantly higher than those for the other attributes, implying that the neural responses were more distinguishable for the corresponding attribute. In addition, to further explore whether the fMRI signals were related to behavioral results, we correlated neural RDMs with behavioral RDMs and found that most of the brain areas in the networks identified by RSA showed significant correlations. This pattern of results provides a preliminary link between the behavioral performance and the corresponding neural substrates for processing BM attributes.
Within the facing direction encoding network, the roles of the lingual gyri, SOG, FG, SPL, postcentral gyri, and IFG seem to be crucial. MVPA revealed that the facing direction classification accuracies were significantly above chance level in the right IFG, higher than those for one of the other attributes in the left SPL and bilateral postcentral gyri, and higher than those for the other two attributes in the bilateral lingual gyri, right SOG, and bilateral FG. The neural responses of all the above regions were significantly correlated with behavioral results. Additionally, the bilateral FG, bilateral postcentral gyri, and bilateral IFG identified in our results have been reported in previous studies on BM facing direction (De Lussanet et al. 2008; Michels et al. 2009). These results share a partially overlapping brain network with those obtained from face and body perception. The brain regions representing the facing direction of bodies involve the FG (Taylor et al. 2010; Vangeneugden et al. 2012; Bellot et al. 2021) and pSTS (Vangeneugden et al. 2014). Recently, researchers have demonstrated a stimulus-independent neural code for the facing direction of face and body in the occipitotemporal cortex, including the occipital face area, extrastriate body area, lateral occipital complex, and early visual cortex (Foster et al. 2022). These brain regions are close to the FG and MTG in the facing direction network observed in the current study. Besides, considering the finding of a previous electroencephalography (EEG) study that the facing direction of BM can trigger an early directing attention negativity at the occipito-parietal electrodes (PO5/6 and PO7/8; Wang et al. 2014), we highlight the role of the SPL. Moreover, previous case studies have also provided evidence that patients with parietal lobe lesions have difficulty in discriminating the facing direction of BM, but not the direction of low-level motion (Battelli et al. 2003).
Within the gender encoding network, MVPA revealed that in the right pSTS, left FG, left MTG, left insula, and right lingual gyrus, the gender classification accuracies were significantly above chance level and higher than those for one of the other attributes. The right lingual gyrus, left FG, and left MFG also encoded the similarity patterns revealed by the behavioral experiment. These results share a partially overlapping brain network with previous results on face and body processing. The brain regions reported to represent the gender of face and body include the FG (Contreras et al. 2013), pSTS, insula, and SMA (Kret et al. 2011).
Within the emotional state encoding network, the MOG, ITG, MTG, and SPL might be crucial. MVPA revealed that in the bilateral MOG, ITG, MTG, and SPL, the emotional state classification accuracies were significantly above chance level and higher than those for the other attributes. In the bilateral MOG, bilateral ITG, bilateral MTG, and bilateral SPL, the correlations between neural responses and behavioral results were significant. Additionally, the FG, ITG, MTG, precentral gyrus, and IFG have also been reported in previous studies (Jastorff et al. 2015; Ross et al. 2019; Poyo Solanas et al. 2020). Previous studies have revealed a broad network for emotional state processing based on other socially relevant information. Focusing on studies of happy and sad emotions, we concluded that the brain regions processing emotional information include the MFG (Kesler-West et al. 2001; McLellan et al. 2012), dorsolateral prefrontal cortex (Vanderhasselt et al. 2011), SFG, IFG (McLellan et al. 2012), ACC (Yoshino et al. 2010), amygdala (Gaffrey et al. 2011), and insula (Hall et al. 2014). It seems that emotional state processing involves complex neural mechanisms that are influenced by many factors, such as emotion types and the moods being expressed.

A distributed and hierarchical neural network for BM processing
To the best of our knowledge, three main neural models have been proposed to account for BM processing. Giese and Poggio (2003) proposed a hierarchical and feedforward computational model with two parallel pathways for the processing of the form and motion of BM. Both pathways consist of several levels and finally converge at the STS. This model does not involve the processing of different BM attributes. Lange and Lappe (2006) proposed a two-stage computational model. The first stage analyzes the form information in each frame, then the second stage analyzes the temporal order of the selected frames, leading to the output on the global motion aspects of the stimulus. This model can explain the processing of facing direction well, but can hardly be extended to the processing of other attributes. Hirai and Senju (2020) proposed a two-process model for BM perception. While this model supposed that the first system detects foot motion (thus processing facing direction) and the second system evaluates global bodily actions (thus processing gender and emotional state), the main purpose of the model was to separate the rapid, subcortical system from the slow, cortical system. Here we developed a theoretical model for multidimensional BM processing based on our results and the findings from previous studies.
BM perception, as a crucial visual process for socially relevant information, is analogous to face perception, which is accomplished by a dynamic and hierarchical neural system (Giese and Poggio 2003; Hirai and Senju 2020). Following the model of the distributed neural system for multidimensional face perception (Haxby et al. 2000), we propose a model that mediates the perception of multidimensional BM information (Fig. 5). The model has a branching structure: a core system for the visual analysis of BM is distinguished from an extended system that processes the attributes gleaned from BM. The core system comprises three brain regions: the pSTS, the MT+, and the FG. The current study demonstrates that these regions are substantially involved in the neural representations of all three BM attributes. Anatomical configuration suggests a hierarchical organization among these regions, in which the middle temporal cortex may provide input to the lateral FG and STS (Gauthier and Logothetis 2000; Cross et al. 2010). Functional connectivity studies of BM processing suggest that the right FG, MT+, and STS are functionally integrated (Dasgupta et al. 2017; Sokolov et al. 2018). Within the core system, the pSTS is the core brain area dedicated to BM processing (Allison et al. 2000; Blake and Shiffrar 2007; Yovel and O'Toole 2016), whose causal role has been assessed in brain stimulation (Grossman et al. 2005; van Kemenade et al. 2012) and brain lesion (Saygin 2007) studies. The pSTS not only selectively responds to BM compared with non-BM stimuli (Bonda et al. 1996; Howard et al. 1996; Chang et al. 2018), but also exhibits special responses across different types of BM stimuli (Frith and Frith 2010). Accumulating evidence also suggests the critical functions of MT+ (Herrington et al. 2007; Jastorff and Orban 2009) and FG (Vaina et al. 2001; Peelen et al. 2006; Lichtensteiger et al. 2008; Michels et al. 2009; Sokolov et al. 2018) in BM processing. The MT+ is a vital region for analyzing the kinematic cues of BM (Jastorff and Orban 2009), while the FG plays a key role in body form processing (Thompson and Parasuraman 2012).
Fig. 5. The model of a distributed and hierarchical system for BM attribute representations. The core system (in orange color) is composed of pSTS, MT+, and FG. The extended system for facing direction representation (in blue color) is composed of SPL, postcentral gyrus, precentral gyrus, and IFG. The extended system for gender representation (in red color) is composed of SMA, insula, and MFG. The extended system for emotional state representation (in green color) is composed of MOG, ITG, SPL, amygdala, and SFG.
The extended system consists of distinct brain regions for further processing the different BM attributes, in concert with other neural networks. The neural network for processing facing direction includes the SPL, postcentral and precentral gyri, and IFG. The neural activity in the precentral and postcentral gyri showed higher classification accuracies for facing direction than for the other attributes and correlated with behavioral results. Moreover, the facing direction of BM is an important social cue that can induce a reflexive attentional orienting effect in the occipitoparietal region (Shi et al. 2010; Wang et al. 2014). Facing direction thus provides a basis for inferring the goal of an individual, which might be extracted in the IFG (De Lussanet et al. 2008; Thompson and Parasuraman 2012). The neural network for processing gender includes the SMA, insula, and MFG. These brain regions were revealed by the present study and have also been reported in previous studies on gender perception (de Gelder et al. 2010; Kret et al. 2011). The neural network for processing emotional state includes the MOG, ITG, SPL, SFG, and amygdala. The ITG is identified as a relevant region for the perception of human bodies, directly neighbors the FG (Weiner and Grill-Spector 2011), and shows increased activity in response to emotional body expressions (de Gelder et al. 2004; Prochnow et al. 2013). The SPL is found to be activated by the visual stimulation of socio-emotional stimuli (de Gelder et al. 2015). The SFG is reported in previous studies on emotional state perception (Mak et al. 2009; McLellan et al. 2012). In the current study, the SFG was identified in multiple regression RSA for emotional state and correlated with behavioral results. The amygdala is considered a hub region for emotional processing (Phelps 2006; Sergerie et al. 2008; Andrewes and Jenkins 2019). Therefore, we include it in the emotional state representation network of the extended system. However, our RSA did not reveal the amygdala in emotional BM processing, likely because the emotions employed in the current study were happy and sad and did not evoke strong neural activation in the amygdala. Previous studies that reported amygdala activations in BM emotional processing usually adopted fearful or angry emotions (Jastorff et al. 2015; Poyo Solanas et al. 2020). Therefore, further work employing a variety of emotion types is required to verify the emotional state representation network for BM perception.
Our model is the first to consider the neural mechanism for the processing of BM attributes. Previous models primarily described neural processing from the perception of basic visual features (e.g. form, local motion) to the recognition of integrated BM information. The advantage of our model is that it provides a different perspective for BM processing, which concentrates on the neural encodings of distinct physical, biological, and social attributes from BM signals. Future directions may include analyzing the functional connections of distinct BM attribute representations in the model and linking them with behavioral performance, as well as further testing the model's predictive capabilities.

Study limitations
The main limitation of the present study was that only three BM attributes were employed. The human visual system can extract and process many other attributes of BM, such as identity (Ng et al. 2006; Westhoff and Troje 2007; Baragchizadeh and O'Toole 2017), personality trait (Klüver et al. 2016), age (Montepare and Zebrowitz-McArthur 1988), and familiarity (Hahn and O'Toole 2017). Thus, further studies could simultaneously consider more attributes to investigate how the brain encodes distinct physical, biological, and social attributes from BM. Besides, a larger sample size may be necessary for a study employing more BM attributes.
While we discovered a hierarchical structure for processing multidimensional BM, we were unable to determine how the processing of distinct BM attributes unfolds over time due to the limited temporal resolution of fMRI. Further studies utilizing techniques such as magnetoencephalography (MEG) or EEG could shed light on the temporal characteristics of these processes.
Our results revealed significant correlations between behavioral and neural responses in most of the regions of the three BM attribute networks. However, the current behavioral experiment had two limitations. Firstly, the behavioral results relied on subjective evaluation of BM attributes, which might be influenced by individuals' judgment criteria. Future studies could employ an objective BM attribute discrimination paradigm. Secondly, the behavioral experiment was conducted before the fMRI scanning, and a more direct link between behavioral and neural responses could be obtained from concurrent behavioral responses during fMRI scanning.

Conclusions
The brain representations of multidimensional BM engage a distributed and hierarchical network, which consists of a core system (i.e. the pSTS, MT+, and FG) and an extended system that processes the distinct attributes of BM. The neural encodings of different BM attributes are governed by a recurrently interactive mechanism in which the processing of each attribute is influenced by the others, leading to a hierarchical structure in which the respective neural representation of facing direction, gender, and emotional state is modulated by each other in descending order.

Fig. 1. Stimuli, task, and RDMs. (A) Illustration of a single frame of the eight BM sequences. F: Female; M: Male; H: Happy; S: Sad; L: Left; R: Right. (B) Schematic representation of the fMRI experiment. (C) Theoretical RDMs for each attribute. 0 means within category and 1 means between categories. (D) V1 RDM calculated by the HMAX model.
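The theoretical RDMs in panel (C) and their use in a multiple regression can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis pipeline of the study: the ordering of the eight conditions, the helper names (`theoretical_rdm`, `upper_triangle`, `regress_rdm`), and the synthetic "neural" RDM are all hypothetical, and ordinary least squares with an intercept is assumed as the regression.

```python
import itertools
import numpy as np

# Eight conditions: gender (F/M) x emotion (H/S) x facing direction (L/R).
# The ordering is an assumption made for this sketch.
conditions = list(itertools.product("FM", "HS", "LR"))

def theoretical_rdm(attribute_index):
    """Return an 8 x 8 RDM: 0 within category, 1 between categories."""
    labels = np.array([c[attribute_index] for c in conditions])
    return (labels[:, None] != labels[None, :]).astype(float)

gender_rdm = theoretical_rdm(0)
emotion_rdm = theoretical_rdm(1)
direction_rdm = theoretical_rdm(2)

def upper_triangle(rdm):
    """Vectorize the off-diagonal upper triangle of a symmetric RDM."""
    rows, cols = np.triu_indices(rdm.shape[0], k=1)
    return rdm[rows, cols]

def regress_rdm(neural_rdm, model_rdms):
    """Least-squares fit of a neural RDM on several model RDMs (with intercept)."""
    y = upper_triangle(neural_rdm)
    predictors = [upper_triangle(m) for m in model_rdms]
    X = np.column_stack([np.ones_like(y)] + predictors)
    betas, *_ = np.linalg.lstsq(X, y, rcond=None)
    return betas[1:]  # drop the intercept, keep one beta per model RDM

# Synthetic "neural" RDM for illustration only: mostly emotion, some gender.
rng = np.random.default_rng(0)
noise = rng.normal(scale=0.01, size=(8, 8))
noise = (noise + noise.T) / 2  # keep the matrix symmetric
neural_rdm = 0.2 * gender_rdm + 0.7 * emotion_rdm + noise
np.fill_diagonal(neural_rdm, 0.0)

betas = regress_rdm(neural_rdm, [direction_rdm, gender_rdm, emotion_rdm])
print(dict(zip(["direction", "gender", "emotion"], betas.round(2))))
```

Entering all model RDMs in one regression yields a weight for each attribute that accounts for the others, which is the property that motivates a multiple regression over separate correlations.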

Fig. 2. RSA results of multiple regression and random results. Group-level results of the searchlight-based multiple regression RSA of facing direction (A), gender (B), and emotional state (C). The group-level results of 1,000 times random RSA results of facing direction (D), gender (E), and emotional state (F). The light purple color shows the results of multiple regression RSA. The green color shows the random RSA results. The dark purple color shows the overlapping results of multiple regression RSA and random RSA. DC between the multiple regression RSA maps and the random RSA maps. Statistical maps show the clusters after Monte Carlo simulations (P = 0.001 at the initial voxel-wise threshold, 5,000 iterations).
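For readers who want a concrete picture of an overlap measure between two statistical maps, the sketch below computes a Dice coefficient on binarized maps. It rests on the assumption that DC denotes the Dice coefficient, which this section does not spell out; the function name `dice_coefficient`, the toy maps, and the convention for empty maps are hypothetical.

```python
import numpy as np

def dice_coefficient(map_a, map_b):
    """Dice overlap of two binary maps: 2 * |A and B| / (|A| + |B|)."""
    a = np.asarray(map_a, dtype=bool)
    b = np.asarray(map_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        # Convention chosen for this sketch: two empty maps count as identical.
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / total

# Toy 1-D "maps" of significant voxels, for illustration only.
map_one = np.array([1, 1, 1, 1, 0, 0, 0, 0])
map_two = np.array([1, 1, 0, 0, 0, 0, 1, 1])

print(dice_coefficient(map_one, map_two))  # two shared voxels out of 4 + 4
print(dice_coefficient(map_one, map_one))  # a map overlaps fully with itself
```

Under this assumption, a value near 1 means the two maps cover nearly the same voxels, and a value near 0 means they barely overlap.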

Fig. 4. Correlations between behavioral and neural RDMs on the significant clusters defined in RSA. Behavioral RDM for facing direction (A), gender (B), and emotional state (C) of a representative participant. The correlation results for facing direction (D), gender (E), and emotional state (F). Results were Fisher transformed. * P < 0.05 after FDR correction.
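A minimal sketch of the correlation step summarized in this caption is given below. It assumes a Pearson correlation over the off-diagonal upper triangle of each RDM followed by the Fisher z-transform; the type of correlation is not stated in this section, and the function name `rdm_similarity` and the random example matrices are hypothetical. Group-level testing and FDR correction are not shown.

```python
import numpy as np

def rdm_similarity(neural_rdm, behavioral_rdm):
    """Fisher z-transformed Pearson correlation between two RDMs."""
    rows, cols = np.triu_indices(neural_rdm.shape[0], k=1)
    r = np.corrcoef(neural_rdm[rows, cols], behavioral_rdm[rows, cols])[0, 1]
    return np.arctanh(r)  # Fisher transform: z = atanh(r)

def random_symmetric(rng, scale):
    """Random symmetric 8 x 8 matrix with a zero diagonal (illustration only)."""
    m = rng.normal(scale=scale, size=(8, 8))
    m = (m + m.T) / 2
    np.fill_diagonal(m, 0.0)
    return m

rng = np.random.default_rng(1)
behavioral_rdm = np.abs(random_symmetric(rng, scale=1.0))
# Hypothetical neural RDM: the behavioral pattern plus a small perturbation.
neural_rdm = behavioral_rdm + random_symmetric(rng, scale=0.1)

z = rdm_similarity(neural_rdm, behavioral_rdm)
print(round(float(z), 3))
```

The transform makes correlation values approximately normally distributed, so that per-participant values can be entered into parametric group-level tests.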